Text File | 1996-06-24 | 50KB | 963 lines

Fred Goldstein k1io
goldstein@aim.dec.com
Version 2.0
3-june-1988

The Radio Shortest Path First Routing Algorithm (RSPF)
For DDN Internet Protocol over Amateur Packet Radio

** DRAFT ARCHITECTURE -- FOR COMMENT **

CONTENTS

  I.    Introduction
  II.   Acquisition of router-router adjacencies
  III.  Acquisition of end node adjacencies
  IV.   Link state propagation
  V.    The Shortest Path First Spanning Tree
  Appendix: Router Parameters

I. Introduction

Amateur packet radio use of the Internet Protocol does not yet
provide all of the capabilities of other IP networks. One particular
example of this is IP packet routing. Existing versions of the AMPR
IP code make use of a static routing table. This requires human
intervention every time a new backbone path is added, and adds
geographic constraints to address assignment which do not exist on
the ARPA Internet.

The core ARPAnet has implemented automatic routing based upon
Dijkstra's "SPF" (shortest path first) spanning tree algorithm. A
similar procedure can be applied to AMPRnet (Net 44). It is called
Radio Shortest Path First (RSPF); this document outlines the RSPF
protocol.

I.1. Elements of RSPF

The RSPF protocol is designed for use by network-layer routing nodes
(IP Gateways) in a packet radio network using the DDN Internet
Protocol. It comprises four major functions:

  1) Acquisition of router-router adjacencies
  2) Acquisition of end node adjacencies
  3) Link state propagation
  4) Spanning tree route decision making.

Its net result is the automatic maintenance of a least-cost routing
table for use by IP Routing. RSPF is optimized for the
geographically hierarchical addressing used in AMPRnet, but does not
depend upon it.

I.2. Addressing convention

When an RSPF router communicates with an end node, it will typically
deal with a 32 bit IP address. Routers themselves, however, also
support node group addressing (fewer than 32 bits need match).
This follows the convention in the KA9Q IP routing program, which
permits a crude form of hierarchical addressing as well as allowing
portable operations to override the defaults. RSPF looks for the
match (node or node group) with the greatest number of matching
bits. Only if the number of bits matched is equal is the lower cost
path used. (Thus a match to a full node address will override a
"cheaper" path that matches its "class C net" of 24 bits, which
overrides a "class B net", etc., noting that AMPRnet does not follow
strict 8-bit address classification, and is a single Class A net.)

I.3. Connection-oriented vs. connectionless lower layers

IP is a datagram network protocol, and supports both
connection-oriented and connectionless lower layers. In a network
with a high packet loss rate, the local retransmission provided by a
connection-oriented datalink will substantially improve overall
throughput. However, in a high-speed dedicated backbone,
particularly one implemented using full-duplex radio or wireline
links, connectionless may provide better overall performance. The
choice of which to use is a local matter; RSPF will work with both.
The reliability of the routing information, however, may be somewhat
greater with connection-oriented datalink procedures, since these
will give more rapid indication of a physical link failure.

I.4. Relationship to other protocols

The reliability of the network depends upon reasonably reliable
transmission of the routing update; hence, for non-broadcast
procedures, it is recommended that routers communicate with one
another using connected-mode AX.25, or another reliable data link
layer. (In any case connected-AX.25 may be more useful than
connectionless for backbone links due to its local error correction
ability.)

All packets specific to RSPF shall be exchanged inside IP packets
using a protocol identifier of <tbd>. Such IP packets shall be sent
with a time to live (TTL) value of 1.
If broadcast procedures are used, connectionless AX.25 UI frames
shall be sent, with a multicast address "QSTRTR-0" in the AX.25
address and an IP address of 44.255.255.255. (A router can also
"read the mail" on passing traffic not addressed to it; such
procedures are for further study.)

Note that in this document, "subnetwork" and "data link" are
synonymous, and refer to the layer over which IP packets are
exchanged.

II. Acquisition of router-router adjacencies

For RSPF to operate correctly, all routers must remain reasonably
current as to the state of the network at large. This is handled by
the propagation of "bulletin" messages among routers. End nodes need
not concern themselves with this; they will normally communicate
through one "designated" router at any given time, for all
(non-adjacent) destinations (not seen by ARP or other lower-layer
procedures).

All information propagated through the bulletin process begins with
each router's maintenance of an adjacencies table. Each router's
adjacency table lists the following information for all other nodes,
both routers and end nodes, from which it directly receives packets
over a subnetwork (or data link):

  Adjacent node IP address (e.g., 44.56.0.44)
  Adjacent node datalink (AX.25) address (e.g., K1IO-0)
  Datalink used (e.g., AX0)
  Datalink cost (e.g., 4)
  Number of packets heard since last RRH update (e.g., 50)
  Packet sequence number in last RRH update (e.g., 12593)
  Time of last RRH update (e.g., 2130).

II.1. Router-router hello

For the backbone to create its topology automatically, there must be
a way for routers to discover each other with minimal overhead. For
this purpose, a router-router hello (RRH) message is provided.
Periodically (as an initial suggestion, shortly before beginning to
propagate the periodic link state bulletin to known adjacencies),
the router sends out the RRH message to the layer 2 multicast
address and IP multicast address (i.e., 44.255.255.255).
RSPF makes no assumption of reciprocity (that links are
bidirectional), so receipt of an RRH packet provides the receiver
with information about a one-way (received) adjacency.

II.2. Connection-oriented procedure

If a router uses connection-oriented datalink procedures to its own
adjacencies (recommended), then when a router receives this RRH
packet, it checks to see if it already has a link to its origin in
its own links table. If not, it waits a random period of time
(initial suggestion: within the range of 0 -> 10*(link's value of
T1, DWAIT or SLOTTIME - tbd)) and then attempts to establish an
AX.25 connection with the usual SABM; the called router responds
with the usual UA (link established) or DM (link rejected).

If a two-way connection has been established, then both routers add
the link to their adjacency tables. While the existence of this
route is set reciprocally, the cost of the route is set separately
for each side of the connection. Cost is transmitted in the link
state packet. Thus the adjacency between two routers is not actually
used until the first link state packet exchange has taken place.

Loss of an adjacency may be noted by the loss of the AX.25
connection. When this occurs, the router removes the lost router
from its adjacency table and follows the "bad news" procedures
outlined below for link state propagation.

II.3. Connectionless procedure

If a router uses connectionless datalink procedures to its own
adjacencies (suitable to low-loss links), then when a router
receives an RRH packet, it checks to see if this adjacency is
already in its adjacency table. If not, then it is added.

Loss of an adjacency may be noted by timeout; if no RRH message is
received, and no frames have been received from the adjacent router
for a period of time (initial suggestion: slightly over twice the
maximum interval between RRH messages), then the adjacency becomes
suspect.
The router should attempt to exchange a packet with the suspect
adjacency; if unsuccessful, the route is marked lost. It may also be
marked lost if other attempts to send data through that router fail.
(Exact procedures for further study.)

Table II-1. Coding of the RRH PDU.

                                1               2
|0              |8              |6              |4              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RSPF Version #|  Type (RRH)   |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Full IP Address of sending router               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    last packet sent seq. #    |             flags             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  plaintext  |...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Parameters--

An RSPF Router-Router Hello packet is sent within IP with a type of
<tbd>. Each RRH packet contains the following fields:

  RSPF Version Number: Version number of this protocol (initially 1).

  Type: Value of 3 for RRH.

  Checksum: IP-style checksum

  Address: Full IP address of sending router

  Last packet sent sequence number: An integer (mod 65535)
  incremented by 1 for every frame sent by the datalink layer. This
  allows receiving entities using connectionless procedures to use
  the automatic link quality measurement technique described below.

  Flags: The low-order bit is 1 if connectionless datalink is
  preferred; 0 if connection-oriented is preferred. (Set by system
  management based upon anticipated link quality.) Other bits are
  reserved (sent 0).

  Plaintext: An optional text message (length may be up to maximum
  size remaining in datalink PDU).

II.4. Automatic link quality measurement

A connectionless link or subnetwork may have very reliable, or very
sporadic, performance. Since there is no procedure for ensuring the
reliability of packets sent over a connectionless link, a high rate
of packet loss may occur without being detected by the routers.
If this loss is high enough, another route may become a better
choice; a high enough packet loss rate may be enough to mark a link
as "down".

Every router shall maintain a count of packets sent over each link.
Every time an RRH message is sent, it includes the current value of
this counter (modulo 65535). Every router also maintains, in its
adjacency table, a count of packets received from this adjacency
since the last RRH message and the last received value of that
counter.

Upon receipt of an RRH message, the recipient router compares the
value of the received packet counter with the last received value in
the adjacency table. The difference (taking into account wrap-around
at the modulus) is compared with the number of packets received
since the last RRH message. (This works even if an RRH message is
lost.) This packet loss ratio is then used as a link quality metric.
(Timestamp is used to compute packet arrival rate.)

Connection-oriented data links presumably deliver 100% of attempted
packets. A high-quality connectionless link, such as Ethernet/LLC1,
will come close to this. However, single-frequency packet radio
links are prone to packet loss for several reasons, including hidden
transmitters, lack of collision detection, and rf interference.

The packet loss ratio is useful in setting link cost, and may also
be used to determine whether a link should use connectionless or
connection-oriented procedures. If a router reports, in its link
update packets, that a given link has a cost of _n_, then its
adjacencies (and only its adjacencies) may apply the packet loss
ratio to adjust the cost which they maintain in their link state
tables. These adjusted costs, rather than the received costs, may
then be propagated to other routers. Specific procedures and
parameters for this are for further study.

III. Acquisition of end node adjacencies

Three possible means of determining adjacencies to end nodes are the
use of connected-mode AX.25 links, the use of ARP, and the use of a
"wiretap" algorithm (see RFC981). Unless a connection mode Data Link
layer (with keepalive timers) is used, adjacent nodes may need to
send each other messages at regular intervals to ensure that the
link is still usable. A procedure is outlined below for routers and
end nodes to acquire knowledge of each other.

It is assumed that RSPF will not be activated in end nodes; this
will permit simpler versions of the IP software to be used. A node
that has RSPF support in its software but operates as an end node
can also use the router-router connection procedures and simply
broadcast its adjacency to the router in a one-entry bulletin with a
horizon of one. Such a node may also be simultaneously homed on two
or more other routers, unlike true end nodes whose traffic either
bypasses RSPF (using ARP) or arrives by way of its associated
router.

If an end node knows the IP address of the router which will connect
it to the network at large, it establishes a connected-mode AX.25
(or other data link layer) connection to the router; the presence of
this connection indicates that the node is reachable from that
router, which then adds it to its links table and subsequent
bulletins. This may, of course, require an ARP exchange in order to
acquire the AX.25 (data link layer) address.

Alternately, the end node can simply use ARP and use connectionless
link procedures. In this case the router should assume that the end
node is available until either a rather lengthy timer expires, or
the router is unable to make an ARP contact after the ARP timer
expires. (A loss of reachability should not be inferred from the ARP
timeout.)

Routers should periodically broadcast their availability (suggested
interval: every 15 minutes) with an AX.25 UI frame sent to QST-0
(the AX.25 broadcast address).
A human-readable "unproto" message may go here, allowing individual
operators to recognize routers and connect as appropriate. (No
specific PDU coding is provided, as the end nodes do not use RSPF.)

A router may also choose to use "Promiscuous ARP" to provide service
to an end node which is attempting to connect with an IP address
reachable by the router. In such a case the router should wait an
extra interval after receiving the ARP request because the desired
destination may actually be directly reachable; ARP procedures may
need to be modified to provide this.

Another potential approach is for routers to simply listen to AX.25
traffic on the link and determine who is adjacent to whom. This is
the gist of the "wiretap" algorithm in RFC981, which also finds
non-adjacent nodes by taking advantage of the source routing found
in AX.25 frames. Integration of wiretap into RSPF is for further
study.

IV. Link state propagation

IV.1. Optional multicast/broadcast

Packet radio is inherently a broadcast medium. Packet radio
networks, however, may be viewed as a collection of individual links
which happen to use a broadcast physical medium. It is also possible
to exploit the broadcast nature of the medium. RSPF link state
propagation procedures allow but do not require such multicasting.

If the link uses connectionless procedures for user data packet
exchange, then broadcast procedures should be used for link state
packet exchange. The converse may not necessarily be true: The
threshold of loss that militates against connectionless transmission
of user data may be more stringent than that which calls for
non-broadcast transmission of link state packets. (Details for
further study.)

IV.2. Routing update bulletins

Routing updates are passed along from router to router via routing
update bulletins. Bulletin propagation is designed to guarantee that
every node within a given "horizon" receives every routing update
message sent out by a given node.
Every router originates information about changes in its own
adjacencies, as well as periodic retransmissions of its entire list
of adjacencies. These messages are then propagated among other
routers. The router whose adjacency information is being broadcast
is called the _reporting router_; this may be some hops away from
the routers which forward it to their own adjacencies.

Each reporting router's adjacency updates contain a sequence number
(and in some cases, a subsequence number). These sequence numbers
are generated by the reporting router and propagated; they are not
changed when a bulletin is relayed. They are incremented by 1
(modulo 65536) every time a new one is generated.

Bulletins may also carry incremental change information to previous
bulletins. These carry the same sequence number as the full bulletin
to which they are reporting incremental changes; each such bulletin
has a subsequence number. The subsequence number is zero for a
complete routing update bulletin.

Every bulletin also has a horizon value, set by the reporting
router, associated with each of its constituent links. (A given
reporting router may have more than one constituent link, if it is a
multi-port router.) Every time a bulletin is propagated, each
horizon value is decremented by 1. When it hits zero, the bulletin
is not propagated further. Note that for horizon purposes, a
bulletin's individual constituent links may have separate horizons;
when a link's horizon hits zero, other links' adjacency information
from the same reporting router may continue to be propagated.

Every router maintains within memory a routers table containing one
tuple for every other router on the network. This tuple contains the
following elements:

  IP Address
  Last received bulletin sequence number
  Last received bulletin subsequence number
  Timestamp for when the data was received.

This table is used to coordinate the receipt and transmission of
bulletins, using either broadcast or non-broadcast procedures.
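
The bookkeeping described in this section can be sketched in a few
lines of Python. This is only an illustration under stated
assumptions: the names RouterEntry, next_sequence and
horizons_after_relay are hypothetical, and the sketch assumes that
the relaying router is the one that decrements each horizon just
before passing the bulletin along, which the text does not spell
out.

```python
from dataclasses import dataclass

SEQ_MODULUS = 65536  # bulletin sequence numbers wrap at this value


@dataclass
class RouterEntry:
    """One tuple of the routers table (field names are illustrative)."""
    ip_address: str
    last_seq: int        # last received bulletin sequence number
    last_subseq: int     # last received bulletin subsequence number
    received_at: float   # timestamp for when the data was received


def next_sequence(seq: int) -> int:
    """Sequence number a reporting router uses for its next bulletin."""
    return (seq + 1) % SEQ_MODULUS


def horizons_after_relay(link_horizons: dict) -> dict:
    """Decrement each link's horizon by 1 and keep only the links that
    may still be propagated (assumption: decrement happens on relay)."""
    relayed = {}
    for link, horizon in link_horizons.items():
        remaining = horizon - 1
        if remaining > 0:
            relayed[link] = remaining
    return relayed
```

For example, a bulletin received with a backbone link at horizon 3
and a local link at horizon 1 would be relayed with only the
backbone link, now at horizon 2; the local link's information stops
at this router.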
IV.3. Flooding without congestion collapse

A bulletin from reporting router X (stating adjacencies seen by X)
is sent, initially by X, to every adjacent router A, which passes it
along to all of its own adjacencies B. The routing update bulletin
(which is loosely based on the Internet EGP (Exterior Gateway
Protocol)) may contain one or more routers' adjacency lists, to be
forwarded to adjacent routers. This "flooding" is controlled such
that no reporting router's adjacency update is sent more than once
between the same pair of routers. (A bulletin packet may actually
concatenate multiple reporting routers' adjacency information; each
is numbered separately, even if transmitted within the same packet.
This is done to reduce the overhead of short transmitted packets.)

After router A sends its bulletin to B, the recipient router B then
looks at the sequence number of each reporting router X's adjacency
message and the sequence number of the last received adjacency
message that originated from X. If the just-received bulletin is
newer (higher number), then it sends a positive acknowledgement to
A, and joins in the flooding to its own adjacencies. The horizon is,
of course, decremented.

If it has the same number, B checks the horizon left in the received
bulletin against the horizon left when it previously received the
bulletin. In the event that the latest bulletin received had a
shorter (lower number) horizon left than the earlier one, it simply
acknowledges the (redundant) message. But if the reporting router
X's horizon left is greater for the new copy of the bulletin, router
B propagates it as if it were a new bulletin. This way, if the
bulletin happened to first arrive via a roundabout path, it won't
accidentally fail to reach the intended end of its range.
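
The decision router B makes for each reporting router's entry can be
sketched as follows. This is a hedged illustration, not a definitive
implementation: the function names and return labels are
hypothetical, subsequence numbers are ignored, an equal horizon is
treated as redundant (the text only covers shorter and greater), and
the wrap-around thresholds of 100 and 65400 are the example values
this document suggests in its description of the bulletin sequence
number field. The sketch also assumes B remembers the horizon left
from the previous copy, which is not one of the listed routers table
elements.

```python
def is_newer(seq, last_seq, low=100, high=65400):
    """True if seq should be treated as newer than last_seq.

    Values below `low` count as higher than values above `high`
    (wrap-around); otherwise plain integer comparison is used.
    """
    if seq < low and last_seq > high:
        return True
    if last_seq < low and seq > high:
        return False
    return seq > last_seq


def on_bulletin(seq, horizon, last_seq=None, last_horizon=None):
    """Classify one reporting router's entry in a received bulletin."""
    if last_seq is None or is_newer(seq, last_seq):
        return "ack_and_flood"      # new information: acknowledge, relay
    if seq == last_seq:
        if horizon > last_horizon:
            return "ack_and_flood"  # same data, but it can reach further
        return "ack_only"           # redundant copy
    return "older"                  # sender is behind; not handled here
```

A copy with sequence number 7 and horizon 4 arriving after a copy
with sequence number 7 and horizon 3 is relayed again, so that a
bulletin which first arrived by a roundabout path still reaches the
edge of its intended range.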
If any router B receives from router A a bulletin (from any
reporting router X) that contains a lower sequence number than one
that had previously arrived at B, then it can be presumed to be
"obsolete" information. B replies by sending a bulletin to A with
the latest link state information for that node X.

As a corollary, a router may poll for information about a given
reporting router by sending a routing update bulletin for that
reporting router with a sequence number that is lower than the
latest one it actually has received. Said bulletin may contain zero
links' information. (Note that since the sequence number is modular,
a value of 0 is not correct for this function as 0 is higher than
65535; the "poll" number should be only slightly lower.)

IV.4. Non-broadcast bulletin propagation

A router may acquire routing information from adjacent routers via
point-to-point procedures which treat every adjacency as a separate
logical data link. (Such a procedure also works, of course, over
point-to-point links such as wirelines.) This tends to have the
highest reliability, since connection-oriented data link control
procedures can be used to ensure the accuracy and completeness of
the data passed over the link. It has the disadvantage of requiring
separate transmission of the same data to different adjacent nodes
on the same channel.

IV.5. Broadcast bulletin propagation

When a router is adjacent to several other routers via the same
broadcast (e.g., packet radio) channel, retransmission of routing
bulletins to each adjacency makes additional use of bandwidth, which
may be a scarce resource. A broadcast procedure may be used to pass
the bulletin along that link. Note that broadcast propagation of
bulletins may be combined with non-broadcast propagation, on a link
by link basis.

Although packet radio is a broadcast medium, it is not a perfect
one: The reliability of multicast packets is not as high as for
wireline LANs.
This leads to the need for a unique broadcast "flooding" technique.

A routing update bulletin is broadcast to the Layer 2 multicast
AX.25 address "QSTRTR-0" using the IP multicast address (in AMPRnet,
44.255.255.255). Individual recipient routers may or may not receive
the entire bulletin, since there is no acknowledgement possible.

In a non-broadcast message sent via a connection-oriented datalink,
the entire routing update bulletin can be assumed to have been
received intact. Thus, if a given reporting router has originated a
new complete list of its adjacencies (new sequence number,
subsequence number equals 0), then any adjacency not included is
presumed to have ceased to exist. Such a presumption is not always
possible when a bulletin was received via unacknowledged broadcast,
as the message might have been received in part. This should not
make the partially received bulletin unusable.

A bulletin is transmitted in one or more packets, each of which
constitutes a numbered fragment of the whole bulletin. Within the
transmitted bulletin, each individual reporting router's node-header
contains the number of links being reported on, and each link-header
contains the number of adjacencies being reported on. Since each IP
packet that makes up a bulletin contains a fragment number, it is
also possible to determine whether or not a fragment was lost.

In connection-oriented non-broadcast procedures, a routing update
bulletin (not a partial one with a subsequence number >0) provides a
complete list of adjacencies known to the sending node, except where
the horizon is exceeded. Absence of a previously-known adjacency
then implies that the adjacency has been lost.
However, that assumption does not apply to fragmented bulletins
received in part, which can occur with broadcast procedures: If a
fragment was lost before reaching the end of a given reporting
router's portion of the bulletin, then the absence of a
previously-known adjacency in the received bulletin does not mean
that the adjacency was lost.

RSPF procedures dictate that routing update bulletins with a
subsequence number of zero are presumed to be complete lists of
adjacencies from their originators; higher subsequence numbers
represent incremental changes only. In the broadcast procedures, a
routing update bulletin with a subsequence number of zero, if
received only in part, is treated as an incremental change bulletin.

Thus, the recipient compares the sequence number with the previously
received sequence number from that reporting router. If it is higher
than the previously received one, then its adjacencies are used. If
it was received in full, and had a subsequence number of 0, then its
previous adjacencies are replaced. If it was not received in full,
or if its subsequence number was not zero, then its previous
adjacencies are left intact unless explicitly changed by the
received bulletin.

If a bulletin is received in full, then the routers table is updated
with the latest sequence and/or subsequence number and timestamp. If
it is received in part, then these entries are not updated. After
the bulletin has been completely transmitted, a recipient node which
has received a partial update from a reporting node may poll for
that node's latest information, by using the (now known to be
obsolete) sequence number for that router in its router table in a
bulletin, with zero links for that reporting router. (This is the
same polling procedure used for non-broadcast links.)

Note that if a fragment is lost, a reporting router whose
node-header is in the lost fragment will of course remain unchanged
in the recipient's data base.
Furthermore, any data received in subsequent fragments, prior to a
node-header, will also be meaningless. The last adjacency of the
last link in a reporting router's bulletin will have the Last flag
set to 1, signaling that a node header follows the address.

IV.6. Routing update bulletin packets

A routing update bulletin packet (Table IV.1) may contain several
different reporting routers' updated link state information,
concatenated into one message, with a different sequence number for
each source (reporting router). One of the sources, of course, may
be the local router. Each router's link state information is further
broken down by link, which allows its backbone routing information
to be propagated separately from its local end node adjacency
information.

Incremental changes (good news and bad news)

Bulletins that only report changes in state come in two flavors.
Good News bulletins inform other routers that an adjacency has been
noted. Bad News bulletins inform them that an adjacency has been
lost. If an end node establishes a connection with a router whose
node group default addresses (based on the significant bit count)
already include that end station, however, no bulletin need be sent
out, as packets to that end station will already be routed
correctly.

Theoretically, a router could send out a new full routing update
bulletin every time it gained or lost an adjacency. However, the use
of shorter Good News and Bad News packets, which do not contain a
full routing update, allows the network to remain relatively current
with less transmitted traffic.

Good news and bad news packets are propagated like other packets,
but are not originated by the same rules. While full routing
bulletins are originated based upon a timer, good news packets are
transmitted immediately. This enables new links to be useful
quickly.
Bad news, however, should not travel as fast: A node should cache
any bad news message for a time (initial recommendation for this
timer: 60 seconds) while waiting for the link to come back up. This
helps keep the network stable; if the node receives a packet
destined for the lost destination, it may send an ICMP "host
unreachable" message to the originator of the packet, unless it can
reroute the packet itself (as may happen with the loss of a backbone
link where others exist).

Because good news and bad news messages represent changes to the
last full link state bulletin propagated, but do not purport to
completely represent a node's link states, they carry bulletin
subsequence numbers. These use the last bulletin sequence number
originated by the reporting router, but the sub-sequence value
increments from 1. (A full link state packet has a sub-sequence
value of 0, and resets the subsequence counter.)

Routes to nearby destinations

Sometimes more than one router will serve the same area (determined
by the significant bits' matching), and they will need to know which
one has the better path to a given station. These adjacency messages
may only require a short horizon, as will Good News and Bad News
packets which refer to end nodes going on and off the air. Higher
horizons are needed for backbone routers.

Table IV.1. Full routing update (link state packet) bulletin:

                                1               2
|0              |8              |6              |4              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ----
| RSPF Version #|     Type      |  fragment #   | fragment total| packet
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -hdr
|           Checksum            |  sync octet   | # nodes below |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ----
|               Reporting node #1 full IP Address               | node
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -hdr
|  Node 1 bulletin sequence #   | sub-sequence #|    # links    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ----
| horizon left  |  ERP factor   |   link cost   | #adjacencies  | link
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -hdr
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Adjacent node(s) 1,1,1 IP address               | adj.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Adjacent node(s) 1,1,2 IP address               | adj.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Adjacent node(s) 1,1,n IP address               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| horizon left  |  ERP factor   |   link cost   | #adjacencies  | link
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Adjacent node(s) 1,2,1 IP address               | adj.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Adjacent node(s) 1,2,n IP address               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Reporting node #2 full IP Address               | node
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Node 2 bulletin sequence #   | sub-sequence #|    # links    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| horizon left  |  ERP factor   |   link cost   | #adjacencies  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Adjacent node(s) 2,1,1 IP address               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Adjacent node(s) 2,1,2 IP address               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| horizon left  |  ERP factor   |   link cost   | #adjacencies  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Adjacent node(s) 2,2,1 IP address               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|significant bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Adjacent node(s) 2,2,n IP address               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Reporting node #n full IP address               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Node n bulletin sequence #   | sub-sequence #|    # links    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
etc.
Parameters--

An RSPF bulletin packet is sent within IP with a type of <tbd>. Each
routing update packet contains a packet header that contains:

  RSPF Version Number: Version number of the protocol (initially 1).

  Type: (Value 1 for Full Routing Update, 2 for Partial Routing
  Update.)

  Fragment Number: States which fragment, in a segmented message,
  this is, beginning at 1. Non-fragmented messages use 1.

  Fragment total: Total number of fragments in message; 1 if not
  fragmented.

  Checksum: IP-style checksum.

  Sync octet: Which octet in this packet (counting from this byte as
  byte 0) is the beginning of a node-header. If 0, this fragment has
  no node-header. Non-fragmented messages use a value of 2 (because
  one byte follows in packet header).

  Number of nodes reporting: The number of reporting routers in the
  message that follows (this packet or a sequence of packets).

The node-header (for each reporting router) contains 8 octets:

  Reporting router #n full IP address: The IP address of the router
  whose adjacencies are being reported below.

  Bulletin sequence number: When a bulletin is passed along, this
  number is NOT changed; every new bulletin from a given Reporting
  router has a value 1 higher than the previous bulletin from that
  reporting router. (Note: This is modulo 65536, so implementations
  must cope with the "wrap around" and consider values below, say,
  100, to be "higher" than values above, say, 65400. Between 100 and
  65400, modular arithmetic is NOT used.)

  Sub-sequence number: Good news and bad news packets representing
  incremental changes from the last full report increment this value
  by 1; it is 0 for full bulletins.

  # links: The number of different cost-horizon values (typically,
  but not necessarily, representing different types of link in a
  multiport gateway) being reported below; the following four octets
  are the header for each link.

[For each reporting router, adjacencies are reported separately by
link cost.
This is the received cost value for that data link, after any adjustment that may be based upon packet loss ratio. Adjacencies are also reported separately by horizon, even if the cost is the same. It does not matter at this point if there are multiple RF or other links, provided their cost and horizon are the same. Likewise, separate received costs or horizons on one link will be treated separately, and such adjacencies will be grouped under separate link headers:]

Horizon left: This number is decremented every time a routing update bulletin is passed along; when it reaches 0, the bulletin is not passed further.

Link cost: A "figure of merit" for each link, rising with slower/poorer links. This is the number whose total is minimized by SPF. The range is 1-127. As a starting point, a 56000 bps fdx backbone link might have a value of 1, a 4800 bps hdx link a value of 4, a 1200 bps hdx link a value of 8 and a 300 bps hf "wormhole" a value of 16. A value of 255 denotes the loss of a link; this is found in Bad News packets. These values should be coordinated network-wide; adjusting them will change the way packets are routed by RSPF.

Number of adjacencies: The number of adjacencies to be listed for that reporting node.

ERP Factor: Used for "approximating" a route; contains the number of significant bits for which a given node can be presumed to be a valid router, even if it doesn't report in detail. A low factor = wider coverage area; thus an ERP of 16 means that if NO other match is found for a given address, this router will try to handle it if it matches 16 bits. Basically a handle for future enhancements; its use will not be required in the initial release of RSPF.

For each adjacency of the given link cost, the following is provided:

Significant bits: The number of bits used for node group routing purposes. Usually 32 but may be lower if a given link purports to serve all end nodes in an area defined using the most-matched-bits node group convention.
Higher numbers of bits matched take a higher priority than least cost. This uses the low-order 5 bits of the octet.

Last-flag: If this is the last adjacency in the list for this reporting router, this value is 1; otherwise it is 0. (This occupies the high-order bit of the significant bits octet.)

Full IP address: The full IP address for this node.

Note that the n,n,n vector within the bulletin has three fields in the above representation: reporting router within bulletin packet, link cost/horizon within reporting router, and reporting adjacency with that link cost/horizon.

IV.7. Fragmentation

In a moderate to large network, a full routing update can easily exceed the maximum size of an AX.25 frame or IP packet. The RSPF Routing Update message, however, may be sent in fragments. No specific protocol is required for this; bulletins are not assumed to be terminated by a packet boundary. Each fragment is, however, numbered in the packet header, along with an indication of the number of fragments to be sent.

In order to permit parsing of the incoming fragments by routers that are using unacknowledged broadcast information (with the high likelihood of lost fragments), every bulletin's packet header contains a sync octet indicator. This indicates which byte, where the next byte in the header is byte 1, is the beginning of a node header. If a previous fragment was lost, the receiver should ignore the number of bytes indicated in the sync octet before resuming parsing of the packet. (If the fragment does not exceed 255 bytes, this will always be sufficient. It is recognized that long packets may not be able to use this mechanism reliably, and a value of "0" should be used to indicate that no sync is possible within this fragment.)

Each routing update bulletin takes the form of a three-dimensional list, with the dimensions being reporting router, link and adjacency. A given bulletin may report on link state from one or more remote nodes, as well as the sending node.
Each node may have one or more links; each link may have one or more adjacencies. Packets may not be fragmented within adjacencies, but may be fragmented after an adjacency's address and before the next adjacency's significant bits field. The next fragment, in a new packet, simply begins where the last one left off; the receiver knows how many more to expect based upon the node and link count in the respective node-header and link-header.

A router may not start sending a new Routing Update message of any kind to any given IP address until all fragments of a previous message have been transmitted.

IV.8. Bulletin Timers

The timers used for bulletin updates must be a compromise between maintaining the network's current state and the traffic required to do so. An initial suggestion: Good news messages should be initiated within a few seconds and bad news messages should be initiated within one minute, with relatively short horizons on the bulletins (i.e., the routers within the region). Full routing updates with normal horizons should be sent out every 30 minutes. If the network is small, more frequent updates may be possible; too frequent updates risk choking the network with update traffic.

V. The Shortest Path First spanning tree algorithm

As a routing node comes onto the network, it acquires over time a complete list of adjacencies between other nodes on the network except as limited by the "horizon". Each adjacency (point to point link) has a "cost" associated with it.

It should be noted that adjacencies, for the purposes of this algorithm, are "logical" and not necessarily physical; if relaying occurs below IP (as in AX.25 digipeating or NET/ROM), the two ends of the link are still adjacent. (Adjacency at the IP internet layer is based upon subnetworks, which may contain their own links.)
Cost is set for the transmit side of each link; if the receiver and transmitter do not agree on cost, the network may apply different routes for packets going in opposite directions between the same two end nodes. (This is not a problem. In a radio environment, one should NOT assume reciprocity across a link.)

Each router builds a _link state table_ that lists, for every known link (from every reporting router), the two ends and the cost of that end of the link. The ends are listed by an IP address and, for the destination IP address, a number of significant bits. This is what's updated by the bulletins and by good news/bad news messages.

Source IP address    Dest. IP addr/bits    Cost
44.56.0.44           44.56.0.128/32        5
44.56.0.44           44.56.0.12/25         10

The goal of the algorithm is to build a _paths table_ which lists, for every reachable node (or node group approximation of fewer than 32 bits) on the network, that node address (or node group address and number of significant bits), the adjacent node used to get there (i.e., where you blast the packets next), and the total cost of the path. (This feeds the Route table in the IP Route module in NET.)

Every router contains, for the purposes of building the tree, a _trial table_ whose entries each have three fields: destination address/bits, adjacent node, and cost of this path. The paths table has one additional field, the _parent_ node, which is the last hop before the destination. Thus in a path A -> B -> C -> D -> E, home router A views E as the destination, D as the parent, and B as the adjacency. Note that in the path from A to B, A is the parent node.

Begin building the paths table by using the home router as the first node on the paths table. The cost to reach yourself is always 0, so make the first entry on the paths table as follows: Source=self, destination=self, parent=self, and cost=0. From here on in, parent is always (by definition of the SPF spanning tree) the node most recently added to the paths table.
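
For readers who prefer code, the tables just defined and the scan-and-promote loop described under SCAN_THE_LINKS can be sketched as below. This is only an illustration under stated assumptions, not the NET implementation. The function name build_paths and the tuple layouts are hypothetical. Every destination is treated as a full 32-bit address, so node group (significant bits) matching is ignored. When two paths to the same destination tie on cost, the sketch prefers the lower parent address, which is one possible reading of the tiebreaker rule.

```python
from ipaddress import IPv4Address
from typing import Dict, Iterable, Optional, Tuple

LinkRow = Tuple[str, str, int]   # (source, destination, cost)
PathRow = Tuple[str, str, int]   # (adjacent, parent, total cost)


def _addr(text: str) -> int:
    """Numeric value of a dotted-quad address, used only for tiebreaks."""
    return int(IPv4Address(text))


def build_paths(home: str, links: Iterable[LinkRow],
                maximum_cost: Optional[int] = None) -> Dict[str, PathRow]:
    """Return {destination: (adjacent, parent, cost)} as seen from home."""
    links = list(links)
    paths = {home: (home, home, 0)}   # the cost to reach yourself is 0
    trial = {}                        # destination -> (cost, adjacent, parent)
    parent = home
    while True:
        parent_adjacent, _, parent_cost = paths[parent]
        # Stop once the newest parent is already beyond maximum_cost.
        if maximum_cost is not None and parent_cost > maximum_cost:
            break
        # SCAN_THE_LINKS: links leaving the parent to nodes not yet on paths.
        for source, destination, cost in links:
            if source != parent or destination in paths:
                continue
            total = parent_cost + cost
            # First hop from home is the destination itself; otherwise the
            # adjacency is inherited from the parent.
            adjacent = destination if parent == home else parent_adjacent
            old = trial.get(destination)
            if old is None or (total, _addr(parent)) < (old[0], _addr(old[2])):
                trial[destination] = (total, adjacent, parent)
        if not trial:
            break
        # Promote the single cheapest trial entry; lower address wins a tie.
        best = min(trial, key=lambda d: (trial[d][0], _addr(d)))
        total, adjacent, via = trial.pop(best)
        paths[best] = (adjacent, via, total)
        parent = best
    return paths
```

Fed three chained links of cost 5 starting at home node 44.56.0.44 (to 44.56.0.128, then 44.56.0.131, then 44.56.0.200), the sketch reports 44.56.0.200 as reached through adjacency 44.56.0.128, with parent 44.56.0.131 and a total cost of 15.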
Destination    Adjacent       Parent         Cost
44.56.0.128    44.56.0.128    44.56.0.44     5
44.56.0.131    44.56.0.128    44.56.0.128    10
44.56.0.200    44.56.0.128    44.56.0.131    15

Paths Table showing relationship between destination, parent and adjacent nodes, where the home node is 44.56.0.44 and 44.56.0.200 is three hops away; all hops here have a cost of 5.

SCAN_THE_LINKS:

The home router now scans its links table looking for all nodes (routers and end nodes) that are adjacent to the parent node, except for links to nodes which are already on the paths table. (It is generally fastest to build the paths table by scanning the links table in order of increasing link cost.) Each of these new nodes is added to the trial table, except for the parent node (which is already on the paths table, so it can't possibly have a shorter path). The value of cost used on the trial table is the cost of the link from the parent to the destination plus the cost to the parent node (taken from the paths table).

A node may only appear once in the trial table at any given time. If an adjacency is found to a node that is already on the trial table, a new entry replaces the existing entry if and only if the new total cost is lower. If the cost is higher, it is ignored. (If the costs are equal, then a "tiebreaker" is determined by treating the lower-numbered IP address as the lower cost. This will simply make the spanning tree more deterministic in case of tie.) Thus the trial table always contains the lowest cost path to each destination found so far.

Once all of the links from the parent node have had their chance to go onto the trial table, scan the trial table for the _one_ entry with the lowest total cost. In case of tie, pick the lower IP address (again just to be deterministic). Move this node to the paths table; guaranteed, there'll be no cheaper path to that node! The adjacency used for that node in the paths table is the adjacency to its parent node.
Note that the parent _must_ already be on the paths table, since the paths table is the source of every parent; you're working your way outward. This new addition to the paths table becomes the new parent node.

Repeat the procedure from SCAN_THE_LINKS above until there are no nodes left on the trial table. This means you've completed the spanning tree and have a least-cost path to every other node.

One of the router parameters is maximum_cost. If the cost to a given parent node exceeds this value, the procedure stops, since all subsequent entries in the route table will have a higher cost. This value relates to the time-to-live parameter of the IP packet: It makes little sense to compute a routing table to nodes that cannot be reached within time-to-live. (Ideally, ttl will be implemented using a timer rather than a node counter, but this is difficult to do with some of the packet radio data link controller implementations; viz., TNCs.)

A router should recalculate its routes periodically based upon the current links table. How frequently depends upon the CPU load involved. Initial estimates are that it should be recalculated after receipt of every routing update bulletin; partial calculations do not save enough CPU time to make them worthwhile.

V.2. Filling in the NET routing table

The route table in NET (the KA9Q et al. implementation of IP) contains, for each entry, the destination address, number of bits, interface, gateway and metric. This is essentially left intact, except that metric is filled in by cost and the routing decision looks for the least cost of all matches. The route table is filled in from the paths table.

Since the routing table will be periodically recalculated from scratch, implementation may require two route tables, one "in progress" and one "in service". When the route calculation is complete, the "in progress" table becomes "in service" and the old one is cleared for reuse.
This allows packet forwarding to continue while the completed paths table is being converted into the route table.

Appendix I. Router parameters

Every router must set a number of parameters in order to operate properly. While RSPF builds its routing matrix automatically, overall network efficiency and stability may require some fine-tuning based upon experience. These include parameters set for each data link layer entity (i.e., each radio channel) and for the router in general.

Link settings:

Set Link cost: This is the cost parameter based upon the link speed and type. Since the overall cost of the end-to-end path is minimized by the RSPF spanning tree, link cost should be set to arrive at the best overall network performance. The legal range is 1-127. This is sent in routing update bulletins.

Node settings:

Add/Delete Node group: [IPaddr]/bits {cost}. This allows a node to announce its ability to serve a group of nodes, identified using the standard NET convention of address/significant bits. Thus a node group setting of [44.56.0.1]/25 will match all nodes from [44.56.0.1] to [44.56.0.127]. Cost is optional; if set, this cost will be propagated as the cost to reach such nodes; otherwise, the link's default is used.

Set horizon link: This sets the horizon value for the node's routing bulletins apropos 32-bit addresses of other connected routers. This should be high enough to propagate across the backbone.

Set horizon group: This sets the horizon value for the node's routing bulletins apropos node group addresses (fewer than 32 bits matched). Usually matches the horizon link value.

Set horizon local: This sets the horizon value for the node's routing bulletins apropos full link addresses (32 bits) within the node group area. This is set to a low value so that only other routers serving the same or overlapping node group(s) will receive these messages.

Set horizon portable: This sets the horizon value for the node's routing bulletins apropos full link addresses (32 bits) not within a node group. This allows portable end nodes to have their location in the network propagated farther than the local horizon; this is usually set the same as horizon group.
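
To make the node settings above concrete, here is a rough sketch of how an implementation might match addresses against a node group and choose among the four horizon values. It is illustrative only. The names RouterParams, node_group and horizon_for are hypothetical, the horizon numbers used in any example are arbitrary, and the is_router flag is an assumption, since this appendix does not say how a router tells connected routers apart from end nodes.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network
from typing import List


@dataclass
class RouterParams:
    node_groups: List[IPv4Network]
    horizon_link: int
    horizon_group: int
    horizon_local: int
    horizon_portable: int


def node_group(text: str) -> IPv4Network:
    """Turn 'address/bits' into a network; bits beyond the prefix are ignored."""
    return IPv4Network(text, strict=False)


def horizon_for(params: RouterParams, address: IPv4Address,
                bits: int, is_router: bool) -> int:
    """Pick the horizon for one adjacency reported in a routing bulletin."""
    if bits < 32:
        return params.horizon_group        # node group address
    if is_router:
        return params.horizon_link         # 32-bit address of another router
    if any(address in group for group in params.node_groups):
        return params.horizon_local        # full address inside a node group
    return params.horizon_portable         # full address outside every group
```

With a node group of 44.56.0.1/25, the address 44.56.0.127 falls inside the group and 44.56.0.128 falls outside it, so an end node at the latter address would be announced with the portable horizon.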